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^We  describe  a  projective  algorithm  for  linear  programming  that  shares  features  with 
Karmarkar's  projective  algorithm  and  its  variants  and  with  the  path-following  methods  of 
Gonzaga,  Kojima-Mizuno-Yoshise,  Monteiro-Adler,  Renegar,  Vaidya  and  Ye.  It  operates 
in  a  primal-dual  setting,  stays  close  to  the  central  trajectories,  and  converges  in  Ofyn  fy) 
iterations  like  the  latter  methods.  (Here  n  is  the  number  of  variables  and  L  the  input 
size  of  the  problem.)  However,  it  is  motivated  by  seeking  reductions  in  a  suitable  potential 
function  as  in  projective  algorithms,  and  the  approximate  centering  is  an  automatic 
byproduct  of  our  choice  of  potential  function. 


1.  INTRODUCTION 

This  paper  is  concerned  with  interior  algorithms  for  the  linear  programming 
problem 


min  g  x 

(Pj)  Ax  =  b 

x>  0, 

1/  n 

where  A  is  a  real  k»n  matrix,  b  is  in  R  ,  and  g  and  x  are  in  R  .  Apart  from 
methods  related  to  the  ellipsoid  algorithm,  we  can  distinguish  two  classes  of  such  methods: 

(i)  projective  algorithms:  Karmarkar's  method  [14]  and  its  variants  including  the 
affine-scaling  variant;  and 

(ii)  path-following  algorithms,  due  to  Gonzaga  [12]  ,  Kcjima,  Mizuno  and  Yoshise 
[15,16],  Monteiro  and  Adler  [21,22],  Renegar  [23],  Vaidya  [28]  and  Ye  [30]. 

We  describe  here  a  method  that  shares  features  with  both  approaches. 

Assume  that  the  data  A,  b  and  g  of  (P^  are  integer,  and  the  input  size  is  L. 
Then  Karmarkar's  algorithm  can  obtain  within  O(nL)  iterations  a  sufficiently  accurate 

solution  that  an  exact  solution  can  easih  be  deduced.  The  work  required  in  each  iteration 

3  2  5 

is  0(n  )  arithmetic  operations  in  the  basic  version,  or  an  average  of  0(n  )  operations 

in  the  modified  algorithm,  giving  a  complexity  of  0(n  L).  Each  iteration  seeks  a 

constant  reduction  in  Karmarkar's  potential  function;  after  making  a  projective 

transformation,  a  step  is  taken  in  a  direction  that  is  the  negative  of  the  projected  gradient 

of  the  potential  function  in  the  transformed  space.  This  constant  reduction  implies  the 

bound  of  O(nL)  for  the  number  of  iterations.  In  practice,  far  fewer  iterations  are 

required;  it  seems  that  O(L),  or  perhaps  0((fn  n)L),  suffice  when  a  reasonable  line 


2 


search  is  performed.  On  the  other  hand,  it  is  known  that  a  reduction  of  the  potential 
function  greater  than  some  fixed  constant  cannot  always  be  achieved;  see  Anstreicher  [2] 
and  McDiarmid  [17]. 

Considering  the  infinitesimal  step  version  of  Karmarkar's  algorithm  and  its 
variants,  one  is  led  to  the  study  of  paths  or  trajectories  in  the  interior  of  the  feasible  region. 
These  paths  have  been  investigated  by  Bayer  and  Lagarias  [5],  Megiddo  [18]  and  Megiddo 
and  Shub  [19].  They  are  closely  related  to  the  paths  associated  with  the  classical  barrier 
function  method  of  Frisch  [11]  and  Fiacco  and  McCormick  [10],  and  also  to  the  notion  of 
"analytic  center"  of  Sonnevend  [24].  Renegar  [23]  gave  an  algorithm  using  Newton's 

9 

method  to  trace  the  path  that  required  0(Vn  L)  iterations,  each  needing  0(n  ) 
arithmetic  operations.  Using  Karmarkar's  trick  of  solving  approximate  systems  of 
equations  and  making  rank  1  updates,  Vaidya  [28]  showed  that  an  average  of  0(n  ) 

9 

arithmetical  operations  per  iteration  sufficed,  giving  an  overall  complexity  of  0(n  L). 
Gonzaga  [12]  independently  and  simultaneously  obtained  the  same  result.  All  these 
methods  operated  in  the  primal  space  alone. 

Kojima,  Mizuno  and  Yoshise  [15]  described  a  primal-dual  interior  algorithm  that 
followed  central  trajectories  in  both  the  primal  and  dual  feasible  regions.  This  method 
used  O(nL)  iterations.  Monteiro  and  Adler  [21]  gave  a  primal-dual  method  that  required 
only  O(^L)  iterations,  and  had  a  complexity  of  0(n  L).  A  similar  complexity  was 
established  by  Kojima  et  al.  [16]  for  a  path-following  method  for  certain  linear 
complementarity  problems,  including  those  arising  from  linear  and  convex  quadratic 
programming.  Monteiro  and  Adler  also  extended  their  method  to  one  for  convex  quadratic 
programming  with  complexity  0(n  L)  [22].  All  these  methods  work  in  a  primal-dual 
setting,  although  there  is  no  additional  computation  compared  to  a  primal-only  method. 
Finally,  Ye  [30]  has  given  a  primal-only  convex  quadratic  programming  algorithm  that 
requires  only  0(Vn  L)  iterations. 
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We  aim  to  shed  light  on  these  two  classes  of  algorithms  by  introducing  and 
analyzing  a  centered  projective  algorithm  for  linear  programming.  Like  Ka;markar's 
method,  it  is  motivated  by  seeking  a  reduction  in  a  suitable  potential  function,  and  the 
search  direction  is  the  negative  of  the  projected  gradient  of  such  a  function  in  a 
transformed  space.  Like  the  path-following  methods,  it  requires  only  0(Vn  L)  iterations, 
operates  in  a  primal-dual  setting,  and  stays  close  (automatically)  to  the  central 
trajectories.  We  analyze  our  method  with  exact  projections,  so  that  it  needs  0(n  ) 
arithmetic  operations  at  each  iteration.  However,  we  shall  see  that  the  direction  generated 
coincides  with  those  of  Monteiro-Adler  and  Kojima  et  al.,  so  their  analysis  shows  that 
inexact  projections  could  be  employed  to  give  an  overall  complexity  of  0(n  L).  We  note 
that  Ye  and  Todd  [31]  have  shown  that  the  path-following  algorithms  maintain  a 
reasonable  decrease  in  a  certain  potential  function;  this  function  differs  slightly  from  the 
one  chosen  in  this  paper. 

One  question  we  particularly  wish  to  illuminate  is  the  following.  The  projective 
methods  define  a  search  direction,  and  the  analysis  then  shows  that  a  constant  reduction  in 
the  potential  function  can  be  achieved;  as  we  have  indicated,  typical  behavior  in  practice  is 
much  more  encouraging.  On  the  other  hand,  the  path-following  methods  try  for  an 
a  priori  determined  reduction  of  (l-7/Vn)  in  the  duality  gap  for  some  constant  7  and 
that  is  what  they  ootain.  Can  one  hope  to  do  better,  and  where  does  the  Jn  term  come 
from?  Some  suggestions  have  been  given  in  earlier  papers;  we  hope  our  approach  yields 
additional  insight. 

The  paper  is  organized  as  follows.  In  section  2  we  describe  Karmarkar's  algorithm 
briefly  and  indicate  why  a  constant  reduction  in  potential  is  obtained.  We  also  outline  the 
step  determination  procedure  of  a  path-following  method.  Section  3  reformulates  the 
primal  and  dual  problems  in  a  symmetric  way  and  introduces  a  combined  primal-dual 
problem. 
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In  section  4  we  state  the  potential  functions  with  which  we  will  be  concerned  and 
discuss  scalings  of  the  primal  and  dual  variables.  Section  5  computes  the  necessary 
projection  matrix  and  hence  the  search  direction  used.  We  also  relate  this  search  direction 
to  the  step  employed  in  path-following  methods.  As  is  perhaps  not  too  surprising  in  view 
of  earlier  analyses  by  Gonzaga  [13]  and  Mitchell  and  Todd  [20],  our  direction  is  a  linear 
combination  of  the  same  two  directions  that  arise  in  path-following  algorithms.  However, 
we  stress  that  the  motivation  behind  it  is  completely  different. 

Section  6  describes  the  algorithm  which  is  analyzed  in  section  7.  Finally  section  8 
contains  further  discussion. 

2.  AN  OUTLINE  OF  PROJECTIVE  AND  PATH-FOLLOWING  METHODS 

Our  problem  is 

min  gTx 

(Pj)  Ax  =  b 

x  >  0, 

where  A  is  k*n,  b  a  k-vector,  and  g  and  x  lie  in  Rn.  We  assume 

(Al)  (P^  has  a  strictly  positive  feasible  solution; 

(A2)  The  set  of  optimal  solutions  of  (Pj)  is  nonempty  and  bounded;  and 

(A3)  The  matrix  A  has  full  row  rank  k. 

To  describe  Karmarkar's  projective  algorithm,  we  assume  for  simplicity  that  no 
optimal  solutions  lie  in  the  relative  interior  of  the  feasible  region  of  (Pj),  that  a  strictly 
positive  feasible  solution  is  known,  say  =  e,  where  e  e  Rn  is  a  vector  of  ones,  and  that 
the  optimal  value  of  (Pj)  is  zero.  Then  it  can  be  shown  that  (P^  is  equivalent  to  the 
homogeneous  problem 
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min  gTx 

(HP ! )  (A,  — b]  (  {  )  =  0 

x  >  0,  (>0,  (x,()#0. 

For  any  problem  (P),  let  F(P)  denote  the  feasible  region  of  (P)  and  F+(P)  the  set  of 
strictly  positive  feasible  solutions.  We  can  evaluate  any  point  (x,£)  6  F+(HPj)  by 
Karmarkar's  potential  function 


^{x,0  =  (n+l)/n  gTx  -  Ej  in  Xj  -  (n  { 


(£) 

+  *>(£) 

l  Xj  ) 

V  Xj  ) 

(2.1) 


Clearly,  <t>^  is  homogeneous  of  degree  0,  and  hence  can  be  viewed  as  defined  on  positive 
rays,  i.e.,  on  certain  points  in  projective  space. 

k  k 

Karmarkar's  algorithm  generates  a  sequence  {(x  ,£  )}  in  F+(HP1)  starting  with 
(x°,<°)  =  (e,l),  with 


^1(xk+1/+1)<^1(xk)fk)-( 


(2.2) 


K 

for  some  fixed  positive  e  and  all  k.  Without  loss  of  generality,  we  can  take  £  =1,  and 


.ki  : 


then  {x  }  is  a  sequence  in  F+(P1).  We  can  then  deduce  from  (2.2)  that 


T  k 


gTxk<(4-)exp(-^)gTx° 


(2  3) 


where  e  denotes  the  vector  of  ones  in  Rn.  If  F(P^)  is  bounded,  we  see  that  gTx^ 
converges  linearly  to  the  optimal  value  of  zero. 
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If  the  data  A,  b  and  g  are  integer  and  the  total  input  size  is  L,  (2.3)  can  be  used 
to  show  that  O(nL)  iterations  suffice  to  give  an  approximate  solution  from  which  an  exact 
solution  can  be  obtained  by  solving  a  system  of  equations.  In  general,  O(nq)  iterations 
reduce  the  objective  function  by  a  factor  of  2^. 

We  now  describe  how  the  reduction  in  potential  given  by  (2.2)  is  achieved.  Let  x 
in  F+(Pj)  be  given,  and  let  E:=diag(x)  be  the  diagonal  matrix  whose  diagonal  entries 
arc  the  components  of  x.  Let  g  :=  Eg  and  A  :=  AE,  and  consider  the  rescaled  probelm 

min  gTx 

(HFj)  [A,-b](?)=0 

x>0,  {>0,  (x, O?0. 

in  terms  of  the  rescaled  variables  x  =  E-1x,  £  =  £.  It  is  easy  to  see  that,  if  (x,£)  and 
(x,£)  correspond  as  above,  then 


0j(x,O  -  (n  det  E  =  ^(x,|) 

:=  (n+l)^n  gTx  -  Ej  Xj  -  (2.4) 


and  that  (x,l)  corresponds  to  the  transformed  vector  (e,l).  From  (2.4)  it  is  sufficient  tc 
obtain  a  constant  reduction  in  ^  from  the  solution  (e,l).  We  do  this  by  making  a  step  in 
the  direction  given  by  the  negative  of  the  projected  gradient  of  4>y  We  find 


=  V^(e,l)  = 


n+1  - 
— -  g-e 

g  e 

-1 


(2.5) 


7 


If  PM  denotes  projection  into  the  null  space  of  a  matrix  M,  then  our  direction  is 


3  :=  -P[A,-b]  7*1 

n+1  . 
—  8 
!  e 

0 


—  _  p 

ITe 


If  we  move  from  (e,l)  to  (e,l)  +  o3/||3||,  we  find 


#j(e,l)  +  a3/l|3||)  =  01(e,l)  +  o3TV#j/||3||  +  O(o2) 

-  ^(e.1)  -  oliail  +  0(o2).  (2.7) 


It  is  clear  that  a  constant  stepsize  a  can  be  chosen  without  violating  positivity.  Hence, 
since  the  higher  order  terms  can  be  controlled,  the  reduction  in  ^  will  be  of  the  same 
order  as  the  Euclidean  norm  of  J.  In  the  proof  of  their  lemma  3.1,  Todd  and  Burrell  [27] 
show  that  ||3||  >  1.  This  yields  a  constant  decrease  in  ^  per  iteration,  and  hence 
convergence  in  O(nL)  iterations.  Greater  reduction  in  and  hence  faster  convergence, 
would  follow  if  ||d||  could  be  shown  to  increase  with  n.  We  will  return  to  this  question  in 
section  4. 

After  making  a  step  in  projective  space  to  get  a  new  (x,^),  we  normalize  to 
(x/|,  1)  (note  that  ^  is  unchanged),  and  then  rescale  to  get  our  new  feasible  solution 
Ex/|  to  (Pj).  Karmarkar's  algorithm  iterates  this  procedure. 
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An  alternative  method,  the  so-called  affine  variant  (first  proposed  by  Dikin  [6]  in 
1967;  see  also  [7],  Barnes  [3]  and  Vanderbei,  Meketon  and  Freedman  [29]),  works  directly  in 
the  original  affine  space.  The  rescaled  variable  x  =  Z_1x  is  moved  from  e  in  the 
direction 

ax  =  -paj.  (2.8) 

Once  again,  we  can  use  a  constant  stepsize  a  in  the  update  x  *-  e  +  <>3X/PXI|,  and  this 
will  yield  a  decrease  proportional  to  ||3X||  in  the  objective  function  value.  However,  no 
lower  bounds  on  ||dx||  have  been  established  and  polynomial  convergence  has  not  been 
shown  for  this  variant  (and  is  believed  unlikely;  see  Megiddo  and  Shub  [19]). 

Before  we  turn  to  path-following  methods,  we  remark  on  a  drawback  of  the 
potential  function  <py  Assume  that  x  is  converging  to  the  unique  optimal  solution  x* 
to  (Pj),  and  assume  that  this  is  a  nondegenerate  basic  feasible  solution.  Let  0  (0)  index 
the  basic  (nonbasic)  components  of  x*.  Let  g  denote  the  reduced  costs  corresponding  to 
the  optimal  basis,  so  that  g.  is  positive  for  j  t  0.  Then 

J 

0x(x,l)  =  (n+1)  in  gTx  -Ej  in  Xj 

=  (k+1)  in  gTx  +  E  fn  ( si*  )  -  £  (n  x 

0  {  xj  '  0  J 

(there  will  be  k  basic  indices);  as  x  -•  x*,  each  term  fn(gTx/xj)  >  in  gj  (j  e  0)  is 
bounded  below,  and  each  term  fn  x.  (j  €  ff)  is  converging  to  in  xt  >  -x.  Hence  a 

J  J 

constant  reduction  in  <j>^  implies  an  average  reduction  of  a  factor  (l-7/(k+l))  for  some 
constant  7  in  the  objective  function.  However,  this  result  is  very  dependent  on  primal 
and  dual  nondegeneracy.  It  is  not  clear  that  the  factor  n+1  multiplying  in  gTx  gives 
the  correct  balance  between  reducing  the  objective  function  and  staying  away  from  the 
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constraint  boundaries;  and  finding  the  right  balance  might  necessitate  knowing  the  degree 
of  primal  and  dual  degeneracy  at  the  solution.  We  will  see  that  this  disadvantage  vanishes 
for  our  method. 

We  now  turn  to  an  outline  of  path-following  methods.  We  discuss  a  symmetric 
primal-dual  variant,  as  in  [21].  Associated  with  problem  (P^)  is  the  barrier  function 
problem  with  parameter  /j: 


min  gTx  -  /i  £j  in  Xj 

(BPj)  Ax  =  b 

x  >  0; 

under  our  assumptions,  this  will  have  a  unique  solution  x(jt)  for  each  /x  >  0,  and  x(/j) 
converges  to  an  optimal  solution  of  (P^  as  p -*()+.  The  optimality  conditions  for 
(BPj),  with  Lagrange  multipliers  y,  can  be  written  as 

g-/iX-1e-  A1y  =  0 
Ax  =  b 

where  X  denotes  diag(x),  and  S  below  similarly  denotes  diag(s).  Letting  s  :=  /nX~^e. 
we  can  write  these  conditions  as 


XSe  -  fje  =  0 

(2.9a) 

Ax  =  b 

(2.9b) 

ATy  +  s  =  g 

(2.9c) 

x  >  0,  s  >  0. 

(2.9d) 
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Thus  y,  with  associate  slack  vector  s,  is  an  interior  solution  to  the  dual  problem 

max  bTy 

(Dj)  ATy  +  s  =  g 

s  >  0, 

and  moreover  the  duality  gap  is  gTx  -  bTy  =  xTs  -•  eTXSe  =  n/j.  The  equations  (2.9)  are 
also  the  optimality  conditions  for  the  barrier  function  problem 

max  bTy  +  til  ins. 

J 

(BDj)  ATy+s  =  g 

s  >  0, 

with  unique  optimal  solution  (y(/i),s ((i)).  Hence,  solving  (2.9)  for  a  sequence  of 
parameters  n  approaching  zero  yields  points  on  the  curve  (x(/i)}  in  the  primal  space  and 
on  the  curve  {(y(p),s(/i))}  in  the  dual  space.  These  are  the  central  trajectories  of  Bayer 
and  Lagarias  [5]  and  the  (central)  pathways  of  Megiddo  [18].  We  call  (x,y,s)  or  (x,s) 
centered  if  (2.9)  holds  for  some  (i. 

Equation  (2.9a)  is  nonlinear,  so  we  must  be  content  with  approximate  solutions. 
Assume  we  have  a  point  (x,y,s)  satisfying  (2.9b)-(2.9d)  with 

||XSe  -  Cell  <  a<  (2.10) 

where  C  :=  eTXSe/n  and  suitable  0  <  a  <  1;  such  a  point  we  will  call  approximately 

centered.  We  then  seek  an  approximate  solution  to  (2.9)  for  some  n  <  (  by  taking  a 

Newton  step.  The  new  point  will  be  (x+d  ,  y+d  ,  s+d  )  where  the  direction  satisfies  the 

x  y  s 

equations 
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Sd  +  Xd  =  -{XSe  -  /*) 

X  9 

Adx  =  0  (2.11) 

ATdy  +  ds  =  0. 

It  can  be  shown  (see,  e.g.,  Monteiro  and  Adler  [21])  that  the  result  is  also  approximately 
centered,  with  duality  gap  approximately  n/i,  if  we  choose  \x  —  (1  -  for  some 

constant  7. 

Most  of  the  cited  papers  give  little  or  no  motivation  or  justification  for  the  choice  of 
H  above.  However,  Gonzaga  [13]  in  his  closely— related  primal-only  method,  shows  in 
section  3.3  that  such  a  n  gives  a  constant  bound  on  ||X”*d  ||,  thus  ensuring  that  the 
new  point  is  feasible.  Kojima,  Mizuno  and  Yoshise  [15]  set  n-  cC,  for  some  constant 
a  <  1  and  find  that  they  can  only  take  a  step  size  of  order  1/n,  resulting  in  O(nL) 
iterations.  They  also  show  (Theorem  2  and  the  following  remark)  that,  even  if  one  is 
exactly  on  the  path  {(x(/i),  y(/x),  s(m))}  and  takes  a  step  in  the  tangent  direction  for  the 
maximum  feasible  distance,  then  the  duality  gap  is  reduced  by  at  least  the  factor 
(l-i/vff). 

3.  SYMMETRIC  PRIMAL  AND  DUAL  PROBLEMS 

In  this  section  we  reformulate  (Pj)  and  (Dj)  into  symmetric  forms,  and  define  a 
symmetric  combined  primal-dual  problem.  Recall  that  we  have 

min  gTx 

(Pj)  Ax  =  b 

x  >  0, 

satisfying  (Al)— ( A3).  Since  we  assume  (P^)  feasible,  b  is  in  the  range  of  A,  so  that  we 


can  write 
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b  =  Ah  (3.1) 

for  suitable  h.  then  the  dual  can  be  written 

max  hTATy 

(Dj)  ATy  +  s  =  g 

s  >  0. 

We  prefer  to  write  this  in  terms  of  s  alone.  Thus  let  B  be  a  matrix  whose  rows 
span  the  null  space  of  A;  by  choosing  a  basis  for  this  space  we  can  assume  that  B  is 
i  *  n  with  full  row  rank  t,  with  k+£  =  n.  Then 

ATy  +  s  =  g  or  some  y  iff  Bs  =  Bg 

and  moreover,  the  objective  function  of  (Dj)  can  be  written  in  terms  of  s  alone  since 
hTATy  =  gTh  -  hTs.  Thus  we  can  rewrite  (Pj)  and  (Dj)  as 

min  gTx 

(P)  Ax  =  Ah 

x  >0 

min  hTs 

(D)  Bs  =  Bg 

s  >0 

where  the  rows  of  A  and  B  span  complementary  orthogonal  subspaces  of  Rn,  which  we 


denote  as 
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A  i  B.  (3.2) 

The  duality  relation  between  (P)  and  (D)  is  easily  deduced  from  the  standard 
result:  feasible  solutions  x  and  s  have  objective  values  satisfying 

gTx  +  hTs  >  gTh  (3.3) 

and  are  optimal  if  and  only  if  equality  holds.  Indeed,  x— h  and  s-g  lie  in  the  null  spaces 
of  A  and  B  respectively,  and  are  therefore  orthogonal,  so 

gTx  +  hTs  -  gTh  =  xTs,  (3.4) 


which  is  nonnegative  for  feasible  x,  s. 

The  standard  symmetric  inequality-form  linear  programming  problems  are 
naturally  included  in  our  format  above.  If  they  are  written  as  max  cTx,  Ax  <  b,  x  >  0 
and  min  bTy,  ATy  >  c,  y  >  0,  then  we  set 

A  =  (A,  I),  B  =  (I,  — At) 

*  =  (  “o  )  “d  h=(b)  <3'5> 

to  get  corresponding  instances  (P)  and  (D). 

We  can  combine  (P)  and  (D)  to  get  the  primal-dual  problem 

min  gTx  +  hTs  -  gTh 

(PD)  Ax  -  Ah  =  0 

Bs  -  Bg  =  0 


x  >  0,  s  >  0 
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which  has  optimal  value  0  when  feasible.  As  stated,  this  problem  is  separable,  but  we 
may  replace  the  objective  function  by  xTs  using  (3.4).  In  this  form,  it  is  clear  that  (PD) 
is  unaffected  if  we  replace  g  by  g  and  h  by  h  where 

Ah  =  Ah  and  Bg  =  Bg.  (3.6) 


Indeed,  we  have  the  useful 

Lemma  l.  Let  (P)  and  (D)  denote  (P)  and  (D)  defined  using  g  and  h  in  place  of  g 
and  h,  where  (3.6)  holds.  Then  (P)  and  (D)  are  equivalent  to  (P)  and  (D)  in  the 
sense  that  their  feasible  regions  are  unchanged  and  their  objective  functions  the  same  up  to 
additive  constants. 

Proof.  It  is  clear  that  the  feasible  regions  are  unchanged.  Now  (3.6)  implies  that 
g  s=  g  +  Atu  and  h  =  h  +  BTv  for  some  u,v.  Thus,  for  any  feasible  x, 

gTx  =  gTx  +  uTAx  -  gTx  +  uTAh  =  gTx  +  (gTh  -  gTh) 

differs  by  a  constant  from  gTx.  Similarly,  hTs  differs  by  a  constant  from  hTs  for 
feasible  s. 

We  will  use  lemma  1  as  follows;  given  (x,s)  6  F(PD),  we  can  assume  without  loss 
of  generality  that  g  -  s  and  h  =  x. 

Let  us  observe  immediately  one  consequence  of  this.  Suppose  that  x  -  s  =  e  is 
feasible  in  (PD).  This  solutioo  is  also  centered,  since  (2.9a)  holds  with  n  =  1.  Then  we 
can  take  g  =  h  =  e  also.  The  direction  chosen  by  the  affine  algorithm  (see  (2.8))  is  then 


Recall  that  the  difficulty  with  the  affine  variant  was  in  bounding  from  below  the  norm  of 
||dx||;  even  if  -g  were  large  (e.g.,  g  =  e),  its  projection  dx  into  the  null  space  of  A 
could  be  small.  Here  this  problem  vanishes;  since  Ai  B,  d  +  d  =  -e  and  we  can 
conclude  that 

l|dj|  =  Vn.  (3.8) 

Note  that  centering  alone  is  not  enough  in  this  analysis — we  need  also  the  primal-dual 
formulation.  In  computing  d  from  (3.7),  the  increased  dimension  implies  no  increase  in 
work;  given  dx  =  -P^e,  we  find  dg  =  -Pge  =  -e  —  dx,  since 

PA  +  PB  =  I.  (3.9) 

The  importance  of  centering  in  the  affine  variant  has  been  demonstrated  also  in  Barnes  and 
Jensen  [4]. 

Since  (PD)  has  optimal  value  0,  it  is  natural  to  consider  its  homogenization 

min  gTx  +  hTs  -  gThr 

(HPD)  Ax  -  Ahr  =  0 

Bs  -  Bgr  =  0 
x  >  0,  s  >  0,  r  >  0 
(x,  s,  t)  t  0. 

As  for  (HPj),  we  can  view  the  feasible  solutions  of  (HPD)  as  rays,  i.e.,  as  points  in 
projective  space.  The  relation  equivalent  to  (3.4)  for  (x,s,r)  €  F(HPD)  with  r>0  is 
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gTx  +  hTs  -gThr  =  xTs/r.  (3.10) 

To  oooclude  this  section,  we  observe  that  our  assumptions: 

(Al)  (P)  has  a  strictly  positive  feasible  solution; 

(A2)  The  set  of  optimal  solutions  of  (P)  is  nonempty  and  bounded;  and 
(A3)  The  matrix  A  has  full  row  rank  k, 

easily  imply  (Bl)  and  (B2)  below,  and  (B3)  is  without  loss  of  generality: 

(Bl)  (D)  has  a  strictly  positive  feasible  solution; 

(B2)  The  set  of  optimal  solutions  of  (D)  is  nonempty  and  bounded;  and 
(B3)  The  matrix  B  has  full  row  rank  i. 

Hence  (PD)  has  a  strictly  positive  feasible  solution,  i.e.  F+(PD)  ^  0,  and  also 
F+(HPD)  t  0. 

4.  POTENTIAL  FUNCTIONS  AND  SCALING 

We  associate  with  the  homogeneous  primal-dual  problem  the  potential  function 

$x,s,r)  =  ^(x,s,r;  g,h)  (4.1) 

:=  (n+p)£n(gTx  +  hTs  -  gThr)  -  E  tn  Xj  -  £  in  Sj  -  (p- n)(n  r 

on  F+(HPD).  Note  that  this  is  homogeneous  of  degree  0,  and  hence  is  defined  on 
(positive)  rays.  Perhaps  the  most  natural  choice  of  p  is  n+1,  leading  to  the  potential 
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function  introduced  by  Karmarkar  [14].  See  also  Ye  and  Todd  [31].  However,  we  will  find 
it  profitable  to  consider  other  values  for  p,  in  particular  p  ~  -jn.  While  it  may  seem 
strange  to  have  a  negative  coefficient  for  (a  r  in  (4.1),  we  note  that  (3.10)  implies 

4>(x,s,r)  =  (n+p)tn  xTs  -  S  in  x.  -  £  In  s,  -  2p  in  r 

J  J 

=  ,/n[^|  (4-2) 

The  first  equation  shows  that  any  p  >  0  is  reasonable,  while  the  second  gives  a  natural 
interpretation  to  p.  Indeed,  x  s/r  is  the  objective  function  of  the  unsealed  solution 
(x/r,  s/r)  in  F+(PD),  while  (xjSj/xTs)  is  a  positive  vector  lying  on  the  unit  simplex. 
Hence  p  balances  a  term  that  measures  the  objective  function  with  the  barrier  function 
that  seeks  to  keep  (x/r,  s/r)  centered  in  F+(PD). 

Our  aim  is  to  secure  an  appropriate  decrease  in  0  at  each  iterate  n.  Tc  this  end 
we  will  work  with  scaled  problems. 

Let  E  be  a  positive  definite  diagonal  n*n  matrix  and  <5  a  positive  scalar.  Define 
the  scaled  data 


A  =  ME,  B  =  ®E_1 
g  =  Gg,  h  =  £E-1h. 


(4.3) 


Then,  in  terms  of  the  rescaled  variables 


x  =  SE  Jx,  §  =  <5Es,  r  =  r 


(4.4) 
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we  have  the  equivalent  problem 

min  gTx  +  hTS  -  gTSr 

(HPD)  Ax  -  Afir  =  0 

8g  -  Bgr  =  0 
x  >  0,  3  >  0,  t  >  0 
(x,  3,  r)  #  0. 

We  let  (PD)  denote  the  corresponding  inhomogeneous  problem,  obtained  by  replacing  f 
by  1  in  (HFD).  It  follows  from  the  definition  (4.3)  that 

A  x  S;  (4.5) 

also  the  objective  function  of  (HFD)  is  6  times  that  of  (HPD)  for  corresponding  points. 
If  we  set 


fe§,r)  :=  0  (X, 3,r,  g,b) 


(4.6) 


Then  we  find 


^(x,3,f)  =  0(x,s,r)  +  p  In  ? 

whenever  the  arguments  correspond  as  in  (4.4).  Thus  to  reduce  the  original  potential 
function,  it  suffices  to  reduce  4>  sufficiently  for  some  appropriately  scaled  problem. 
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We  now  describe  a  particularly  useful  rescaling.  Suppose  we  have  (x,s)  6  F+(PD). 
so  that  (x,s,l)  €  F+(HPD)  with  objective  value  xTs.  Let 

C  :=  xTs/n.  (4.7) 

According  to  lemma  1,  we  may  take  g  *  s,  h  =  x  without  changing  the  problems. 

We  also  suppose  (x,s)  is  approximately  centered  in  F+(PD),  so  that 

||XSe  -  Cell  <  aC  (4-8) 


for  suitable  0  <  a  <  1.  Now  choose 


<5=  C“1/2  and  E  =  (XS-1)1/2. 

(4.9) 

Then,  with  g  =  s,  h  =  x,  we  find 

g  =  h  =  rll2(XS)l/2*  =:  e 

(4.10) 

and 

gTh  =  C_1eTXSe  =  n. 

(4.11) 

Since  e  is  close  to  e,  and  our  current  point  (x,s,l)  has  been  rescaled  to  (e,e,l), 
we  have  simultaneously  scaled  the  primal  and  dual  problems  reasonably  well.  At  the  same 
time,  by  employing  a  symmetric  scaling  for  the  primal  and  dual,  we  guarantee  condition 
(4.5),  which,  as  we  shall  see,  is  crucial. 
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5.  PROJECTIONS  AND  THE  SEARCH  DIRECTION 

We  will  be  working  with  the  scaled  problem  (HFD)  and  the  associated  potential 
function  <fr.  The  scaling  will  usually  be  that  at  the  end  of  section  4,  so  that  g  =  fi,  but  we 
first  compute  the  direction  in  general,  assuming  only  that 

gTfi  =  n  (5.1) 


for  notations!  simplicity;  this  can  always  be  achieved  by  a  scalar  parameter.  We  also 
assume  that  g  and  fi  have  been  adjusted  if  necessary  so  that  our  current  solution  is 


fi  ' 

I 

1 


(5-2) 


We  first  evaluate  the  gradient  of  the  potential  function  at  the  current  point.  For  a 
vector  u  =  (uj)  e  Rn,  it  is  convenient  to  denote 


and  to  write  u 


for  or1)7. 


Then  the  gradient,  using  (4.2),  is 


(5.3) 


g 

ffi"1) 

n  =  vW4,«  =  nr4 

fi 

— 

r1 

o, 

2  P  , 

(5.4) 


The  search  direction  we  will  employ  is,  as  in  projective  methods  for  (P^),  the  projection  of 
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-V^  into  the  tangent  space  of  the  feasible  region.  Hence,  with 


A  0  -Ah 

C:=  0  fi  -6| 

we  seek 

a'  :=  -Pc  V^. 


(5.5) 


(5.6) 


An  important  consequence  of  the  primal-dual  setting  is  that,  with  the  scaling  given  at  the 
end  of  section  4,  3'  (or  its  modification  3  below)  cannot  be  too  short. 

Before  computing  3',  we  remark  that,  if  the  current  point  has  g  =  h  =  e,  r  =  1, 
then  the  homogeneity  of  0  implies  (eT,eT,l)V0  =  0.  Hence  3'  to  also  the  projection  of 
-V#  into  the  null  space  of 


0  -Ah 
S  -fig 
eT  1  ' 

and  this  direction  is  also  appropriate  if  the  "simplex  constraint"  eTx  +  eT§  +  t  =  2n+l  is 
added  to  (HFI5).  A  scalar  multiple  of  3'  is  obtained  if  we  project  the  negative  gradient 
of  the  objective  function  into  this  same  null  space.  However,  if  we  work  with  Vtf>,  the 
extra  appended  row  is  unnecessary.  Omitting  the  simplex  constraint  preserves  the 
homogeneity  of  (HPD)  and  has  been  propounded  strongly  by  de  Ghellinck  and  Vial  [8,9]. 

In  our  context,  adding  a  row  of  ones  changes  the  direction  since  the  current  solution  is 
generally  not  x  =  5  =  e,  r  =  1. 
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We  now  need  to  compute  P^.  An  important  fact  which  follows  from  (4.5)  is  that 

PA  +  Pg  =  I;  (5.7) 

we  use  this  repeatedly.  Let 


A  0  -fi 

T  :=  o  g  ,  u  ,soC  =  [T,Tu]. 


Then 

[PA  0  )  (PB  0  ] 

Prp=  q  p_  ,1—  Pfp=  q  p_  ,  SO 

B  A 

( -P6fi  1 

V  :  (I  PT)  U  —  |-PAg 

Hence 

u>:=(l+vTv)  1  =  (1  +  hTPgh  +  gTPAg)  *.  (5.8) 

Now  from  section  4.2  of  [26]  we  obtain 

Lemma  2- 

PA  0  0  Pgh  Pgh  |T 

Pc=  0  Pg  0  +o>  PAg  PAg  (5.9) 

,0  0  0  1  1 


where  u  is  given  by  (5.8). 
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To  compute  3'  in  (5.6),  we  first  calculate 

[pflS  ffM 

PAg  E  =  gTPgfi  +  |TPa£  =  gTB  =  n 
1  [o 

using  (5.7),  and 

PgE  )T  f  6"1 

P^g  g”1  =  5  TpgE  +  g_TpAg  +  2p 

,1  J  12 p  , 

=  n  +  p  +  o,  where 

o  :=  p  +  g_TPAg  +  E_TpfiE  ”  D-  (5>1°) 

From  (5.4),  (5.6),  (5.9)  and  the  above,  we  find 


‘  pAg  (  PSfi  )  ' 
a-=-PcV*  =  -2±2  PgE  +un  PAg 

.  1  0  )  1  )  . 

fp6fi 

+  utn+p+a)  PA| 

1 

pA?  )  (  Pa£_1  P66 ' 

=  PgE  +  pBr‘  +  «»  Pa*  •  (5.ii) 

.  o  )  l  0  J  1  , 


The  direction  3'  is  appropriate  to  projective  space,  the  domain  of  (HFD).  It  is 
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convenient  to  transform  this  direction  back  to  an  appropriate  direction  for  the 

/ 

inhomogeneous  problem  (PH).  If  we  move  from  our  current  point  a  step  size  x  >n  the 

_  t  _/_/_/ 

direction  d  =  (d„,d.,d  J,  we  reach 

X  b  / 


x 
/  _  / 


fi  +  x  3 

I  +  x  a 

i  +  x'a;j 


=  (i  +  x'ap 


+  x3? 
I  +  xi 


where 


X  :=  x'/(l  +  X  and 

ax  =  ax-a;E, 
as  =  as-d> 


Given  the  homogeneity  of  <t>,  it  is  therefore  equivalent  to  move  a  step  size  x  in  the 
direction 


a*l 

._  n +p 
n 

fPAS 

pas~;  1 
p6« 

f-pAfi  I 

d  = 

^3 

Pgh 

+ 

+  ua 

-pB* 

ar 

0  ■ 

0 

0  ■ 

We  now  note  the  simplifications  that  occur  when,  as  in  (4.10),  g  =  h  =  e.  Then, 
from  (5.1), 
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and  clearly 


eTe  =  n 


rTe  =  n. 


Hence  from  (5.8) 


w  «  (1  +  §tPa§  +  §TP6ef 1  -  (1  +  eTe)  1  =  (n+l)  1 

and  from  (5.10) 


<r  =  />  +  e  TP^e + e  TPg§-n  =  p  +  e  'e-n  =  p. 


T- 


Hence, 


d  =  -(1+0) 

pA?  ] 
Pge 

+ 

pAr‘ 

Pge'1 

0 

0 

(5.13) 


where 


_  2n+l 
^  n(n+l)  P' 


(5.14) 


Again,  if  e  is  close  to  e  and  a  suitable  value  of  ip  is  chosen,  our  primal— dual  setting 
assures  us  that  d  cannot  be  too  small.  Further,  there  is  no  computational  expense  for  this 
increased  dimensionality;  if 


then 


f  :=  -o+tfr)e  +  e  \ 

(5.15) 

dx  =  pAf  and  as  =  f-dx. 

(5.16) 
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To  conclude  this  section,  we  relate  our  direction  3  in  (5.13)  to  those  generated  by 
path-following  methods.  Consider  equations  (2.11).  If  we  scale  as  at  the  end  of  section  4. 
we  can  rewrite  these  equations  in  terms  of 

3  =  <E_1d  .  3  afid. 
x  x’  s  s 


The  last  equation  of  (2.11)  is  equivalent  to  Bd.  =  0.  We  thus  obtain 

S 


(XS)1/2(ax  +  as)  =  -«XSe-pe) 
Aax  =  o 
eas=o. 


The  first  equation  can  be  written 

3X  +  3g  =  -e  +  (p/  C)e  1, 

and  so 

ax  =  pAf’  as  -  p6f  '  <5  I7» 

with 

f  =  -e  +  (m/ C)e  (5. IS) 

The  similarity  to  (5.15),  (5.16)  is  striking.  Indeed,  if  ip  is  chosen  so  that  1  +  ip  =  (///, 
then  the  two  directions  coincide.  If  n  =  ({ l-7/Vn)  for  some  constant  7,  this  suggest 
ip  =  7'/Vn  for  some  constant  y .  Our  choice  of  ip  will  turn  out  to  have  this  form,  but  it 
is  motivated  by  obtaining  suitable  reductions  in  the  potential  function  <p. 
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The  analysis  above  is  appropriate  for  the  symmetric  primal-dual  algorithms  of 
Monteiro  and  Adler  [21]  and  Kojima,  Mizuno  and  Yoshise  [16].  Indeed,  equations  (2.11) 
are  exactly  equivalent  to  equations  (3.1)  of  [21]  with  X  =  X,  Z  =  S  and  corresponding 
changes  of  notation.  Kojima  et  al.  [16]  deal  with  the  linear  complementarity  problem 
w  =  Mz  +  q  where  M  is  positive  semi-definite.  Given  the  linear  programming  problems 
max  £Tx,  Ax  <  b,  x  >  0,  min  bTy,  ATy  >  c,  y  >  0,  as  above  (3.5),  we  set 


w  = 


and  q  = 


b 

-c 


to  get  the  corresponding  linear  complementarity  problem,  where  u  and  v  are  primal  and 
dual  slack  vectors.  Then  if  (P)  and  (D)  are  defined  using  (3.5),  with 


it  is  easy  to  see  that  Kojima  et  al.'s  equations  (2.1),  which  in  the  notation  above  are 


Wdz  +  Zdw  =  ZWe  -/je,  dw  =  Mdz 


can  be  rearranged  to  give  our  equations  (2.11). 

Finally,  Gonzaga's  direction  d^  in  his  algorithm  4.2  [12]  is  of  similar  form  if  his 
scaling  vector  z  is  taken  as  ^^(XS”1)1^  «  x.  Indeed,  in  his  notation 


and  with  the  choice  of  z  above, 


P  =  and  y  =  Z  ^  =  e,  so  Y  *e  =  e  *.  Also, 
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C  s  Zg,  and  by  the  proof  of  lemma  1  its  projection  is  the  same  as  that  of  (e.  Hence 


=  PA("^ek+l)§  +  S  )’ 


which  relates  directly  to  (5.15)— (5.16). 


6.  THE  ALGORITHM 

We  now  have  all  the  ingredients  for  our  algorithm.  We  assume  we  have  available  a 
point  (x®,s®)  €  F+(PD)  that  is  approximately  centered,  so  that 

||X°S°e- (°e||  <  a<°,  (6.1) 


where  C°  =  eTX°S°e/n,  for  suitable  0  <  a  <  1.  Obtaining  such  a  point  has  been 

discussed  in  [12,15,16,21-23,28].  We  generate  a  sequence  {(x^,s^)}  £  F+(PD)  as  follows, 
k  k  k  k 

Given  (x  ,s  ),  set  x  =  x  ,  s  =  s  and  define  the  scaled  data  by  (4.3),  with  6 

k  k 

and  E  given  by  (4.9).  Then  (x  ,s  )  is  transformed  into  (e,e).  We  choose  a  by  (5.13) 
and  set 


for  appropriate  \  >  0.  Now  we  unscale  the  point  to  get 


r1  Ex 

r1  E"1 5 


We  aim  to  prove 
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Theorem  1.  Let  and  ^  ~  ^  a  =  1/3  40(1  X  =  i/15-  Then  if 

(x®,s®)  satisfies  (6.1),  the  algorithm  above  is  well-defined  and  generates  a  sequence  of 
points  satisfying 

||XkSke  -  ^ell  <  af  (6.4) 

where  ^  =  eTXkSke/n  and 

0(xk+1,sk+1,l)  <  ^(xk,sk,l)  —  0  (6.5) 

where  >3  =  1/9. 

The  inequality  (6.5)  implies  the  desired  complexity  of  the  algorithm.  Indeed, 
from  (4.2) 

^(x,s,l)  =  p  £n(xTs)  In 

J  x  s 

and  (6.1)  shows  that  the  last  term  is  bounded  for  (x°,s°)  (and  by  (6.4)  for  all  (xk,sk)). 
Hence,  with  p  ~  -Jn,  (6.5)  gives 

Corollary  1.  In  0(^i  L)  steps,  the  algorithm  above  yields 

eTXkSke  <  2_LeTX°S°e.  (6.6) 

In  the  integer  model,  a  suitable  choice  for  (x  ,s^)  will  then  show  that  0(Vn  L) 
steps  suffice  to  give  a  solution  from  which  optimal  solutions  to  (P)  and  (D)  can  be 
recovered.  See  [12,15,16,21-23,28].  Here  L  is  the  input  size  of  the  instance. 
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In  fact,  our  analysis  below  will  show  (6.6)  directly,  by  proving  that  the  objective 
function  decreases  by  a  factor  (1  —  7/Vn)  for  constant  7  at  each  iteration.  However, 
since  our  algorithm  is  motivated  by  decreasing  the  potential  function  <p,  we  prefer  to  use 
it  as  a  criterion.  The  approximate  centering  shown  by  (6.4)  is  an  automatic  byproduct  of 
the  method. 

Towards  proving  theorem  1,  we  note  that  conditions  (6.4)-{6.5)  are  invariant  under 
scaling  of  the  problem,  so  that  it  is  sufficient  to  show  that  the  move  from  (e,e)  to  (x,5) 
preserves  (6.4)  and  yields  the  potential  function  reduction  (6.5).  Removing  the  overbars 

9 

for  notational  simplicity,  we  only  need  to  establish  the  following  (here  e  denotes  the 
2  2 

vector  with  (e  )•  =  (Sj)  ): 

Lemma  3.  Let  p ,  ip,  a,  \  and  0  be  as  in  the  theorem.  Suppose 

||e2  -e||  <  a.  (6.7) 


If 


d  = 

d*  1 
ds 

=  — (1  +1p) 

PA'  1 
PBe 

+ 

^‘1  ] 
PB6 

dr  J 

0 

0 

(6.S) 


and 


x 

s 


9, 

e 


+  X 


then 


lIXSe  -  Cell  <  oC 


(6.9) 
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with  {  =  xTs/n,  and 


$x,s,l)  <  $e,e,l)  -/?. 


(6.10) 


7.  ANALYSIS 

This  section  proves  lemma  3.  We  carry  out  the  analysis  as  far  as  possible  using 
general  values  for  p,  ip,  a,  x  and  0.  This  will  help  us  discuss  the  choice  of  p,  ip  made. 
Only  when  we  complete  the  proof  of  lemma  3  will  the  specific  values  be  used.  However,  we 
assume  throughout  that  x  is  such  that  (x,s)  >  0.  Since  (e,e)  is  feasible  and  d  (de)  in 
the  null  space  of  A  (B),  this  assures  us  that  (x,s)  is  feasible  also. 

We  now  calculate  some  key  quantities.  As  usual,  we  denote  by  D  and  D 

X  u 

diag(dv)  and  diag(dj  respectively,  and  E  denotes  diag(€). 

X  s 

Lemma  4-  Wehave 


-V4>Td  =  pip  +  e-Te-1  -  n, 

(7-1) 

dTd  =  nil?  +  e-Te-1  -  n, 

(7-2) 

C  :=  xTs/n  =  1  -  x^P,  and 

(7.3) 

XSe  -  Ce  =  (1  -  x(l+^))(e2-e)  -I-  *2DxDse. 

(7.4) 

Proof.  In  the  present  setting,  (5.4)  simplifies  to 


V0  =  H±£ 
n 


e 

e'1 

T1 

e 

— 

.  o  , 

2  P  , 
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Then  (7.1)  and  (7.2)  are  straightforward  consequences  of  (6.8),  using  PA  +  Pg  =  I 
repeatedly. 

Next  note  that  d^d.  =  0,  since  d„  and  de  lie  in  the  null  spaces  of  A  and  B 

X  S  x  s 

respectively,  while 


dx  +  dg  =  -{l +^)e  +  e  (7.5) 

Hence 

C  =  (*  +  *dx)T(e  +  *dg)/n 
=  (eTe  +  *eT(dx  +  dg))/n 
=  (n  +  *(-n(l+tf>)  +  n)/n 
=  l~X^ 

giving  (7.3),  and 


XSe  -  (e  =  (E  +  *Dx)(E  +  *Dg)e  -  (e 

=  e2  +  *E(dx+dg)  +  *2DxDge-  (l-^)e 
=  e2  +  x1  -(l+^)e2  +  e)  +  X2DxDge-  (l-^)e 
=  (1  -  *(l+^))(e2  -  e)  +  X2DxDse, 

establishing  (7.4). 


To  use  Lemma  4,  we  need  to  bound  e  Te  * 


33 


Lemma  5.  If 

||82-e||<a 


with  a  <1/3,  then 


n  <  <  n+1.  (7.6) 

Proof.  For  the  lower  bound,  note  that  e^e  =  n,  while  ||e||  =  V5-  Hence  ||e  ^||  >  Jn. 
Now  since  e  >  0,  we  certainly  have 


||e  -  e||  <  a 

also.  So  ||e  1 —  e||  =  ||E-1(e  -  e)||  <  ||e  -  e||  <  y~ ,  since  each  diagonal  entry  of 

is  at  most  (1  -  a)-1.  Hence  ||e-1-  e||  <  a  +  <  1  since  a  <  1/3.  Finally 

He”1!!2  =  ||e  +  (e-1-  e)||2  =  ||S||2  +  ||e_1-  e||2  <  n+1  since  eT(e_1-  e)  =  n-n  =  0. 

Finally,  we  need 

Lsmm&ii-  If  1  —  Or  —  xl|d||  >  0,  then 

^(x,s,l)  <  0(e,e,l)  +  xV<PT d  +  2(1— ^||d||)  X2dTd.  (7.7) 

Proof.  We  start  with 

\2  \3 

/n(l+A)  =  A  — +  tj—  ... 

-A-!2i*L  =  A'2<HAltA2 


if  | A |  <  1.  Now  suppose  | e  — 1 1  <  a  (so  |e— 1|  <  o  also)  and  |0|  <  ||d||.  Then 


ln(e  +  x$)  =  tne  +  /n(l  +  x  7) 


>  <n  <  +  x  | - 1— j— j 

'  2(1  -  xlfll)<2 


Now  if  f  <  1,  then 


(1-Xl7l)<2  =  <2-X|0|<><2-Xl*l 

>  1  -o-xt|d||, 


while  if  e  >  1,  we  have 


\,2 


(1-Xlfl)«*>  1  — 1-*|#|(1  +  or) 


>  1  -a-x||d|| 


since  y|0|  <  1.  Hence 


tn(<  +  X9)  >  l*  €  +  x  7  -  2(l-a-x||d||)  * 


1  vV 

-.11  Jill  X  V  . 


(7-8) 


Note  that  the  first  two  terms  are  the  first-order  Taylor  approximation  of  fri((  +  \6). 
Now  consider 


0(x,s,l)  =  p  (a(e  x  +  e  s  —  n)  —  E  Xj  -  E  (n  Sj. 


The  first  term  is  concave,  hence  its  first-order  Taylor  approximation  is  an  overestimate. 
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For  the  remaining  terms  we  use  the  approximation  given  by  (7.8),  where  ^  denotes  a 
component  of  e  and  0  a  component  of  dx  or  dg.  Since  the  sum  of  all  such  ft  is  dTd, 
the  result  follows. 


Proof  of  Lemma  3.  By  lemmas  4  and  5,  4  <  dTd  <  5  and  so  ||d||  <  5/2  and  1  -  a  -  x||d|| 
>l-j  — 13*2  —  £•  Thus  each  component  of  x  .  and  of  s  is  positive.  Also, 

X(1  +  ip)  <  1,  so  that,  again  by  lemma  4 


lIXSe  -  Cell  <  (1  -  x(l  +  Wile2  -  e||  +  *2dTd 

<  (1  -  \ip)ot  +  *(xdTd  -  a) 

< 


Finally,  by  lemmas  4  and  6  and  the  above  estimates, 


<0(x,s,l)  <  tf(e,e,l)  +  *V«>Td  + 


1 

2TT=cRMT 


2  .t  . 
t  d  d 


<  0(e,e,l)  —  •  2  + 

=  0(e,e,l)-g. 


/  1  v 


■  5 


(7.9) 


This  completes  the  proof. 

8.  DISCUSSION 

We  now  discuss  the  choice  of  parameters  made.  From  (5.14),  ip  is  about  2p/n,  so 
to  achieve  a  good  reduction  in  the  potential  function  we  would  like  -V0Td  and  hence  p 
large.  To  maintain  feasibility,  we  need  \  =  0(l/||d||).  Let  us  examine  the  consequences 
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of  choosing  p  of  order  n.  Then  ip  is  of  order  1,  -V0Td  of  order  n,  and  j|d||  of  order 
Vn.  Hence  x  is  of  order  1/Vn,  and  the  first  order  change  in  <p  of  order  Vd.  Moreover, 
(7.7)  shows  that  the  error  in  this  first-order  estimate  is  of  order  1.  Hence  we  can  achieve 
a  decrease  in  <p  of  order  (see  also  Ye  and  Todd  [31]).  This  would  seem  to  imply  a 
complexity  of  0(Jn  L),  since  with  p  of  order  n  we  need  a  reduction  of  order  n  in  0  to 
reduce  the  objective  function  by  a  constant;  this  bound  also  appears  to  follow  from  the 
objective  function  reduction  in  (7.3).  However,  such  a  choice  of  p  does  not  seem  to  allow 
the  algorithm  to  remain  approximately  centered,  and  hence  the  improvement  cannot  be 
sustained.  While  the  first-order  term  in  (7.4)  is  very  attractive,  the  second-order  term 
seems  to  be  too  large.  Indeed,  (7.9)  shows  that  we  require  *dTd  —  o  <  0,  and  so  \ 
should  be  0(l/||d||  ).  This  would  give  x  of  order  1/n  and  hence  only  a  constant 
reduction  in  <p  and  a  reduction  in  the  objective  function  of  (1-7/n)  for  constant  7,  as  in 
Karmarkar's  algorithm  or  Kojima  et  al.  [15]. 

On  the  other  hand,  if  p  is  smaller  than  order  Vn,  then  -V<pJd  is  smaller  than 
order  1,  and  we  may  not  be  able  to  achieve  a  reduction  in  <p  due  to  second  order  terms. 
Our  choice  of  p  as  approximately  Vn  balances  these  requirements  nicely. 

We  must  mention  a  disadvantage  of  our  choice  of  parameters.  One  of  the  goals  of 
our  research  was  to  develop  an  algorithm  that  required  only  Vn  L  iterations  as  in  the 
path-following  methods  and  yet  gave  the  possibility  that  line  searches  could  reduce  the 
complexity  further.  While  line  searches  are  possible  in  the  path-following  algorithms  [12], 
the  direction  is  determined  directly  or  indirectly  by  seeking  a  reduction  of  the  objective 
function  by  a  factor  (1-7/Vn)  for  constant  7,  and  it  is  not  clear  that  these  directions 
are  good  for  longer  steps.  Our  directions  arise  from  seeking  reductions  in  a  potential 
function,  and  one  could  hope  that  line  searches  would  be  more  effective. 

Unfortunately,  while  feasibility  demands  only  x  t0  be  0(  1/Hdlj^),  which  might  be 
of  order  Vn,  the  approximate  centering  places  more  stringent  conditions.  Indeed,  (7.4) 
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suggests  that  we  must  maintain  l-*(l+0)  positive  (or  not  too  negative)  which  limits  \ 
to  ordei  1.  It  may  well  be  that  the  inherent  curvature  of  the  "cone"  of  approximate  centers 
does  not  allow  line  searches  to  be  effective.  Then  either  one  must  establish  a  method  that 
successfully  moves  further  down  the  path  of  centers,  or  employ  a  higher-order  predictor  of 
the  path,  as  suggested  by  Bayer  and  Lagarias  [5]  and  Adler,  Karmarkar,  Rcsende  and 
Veiga  [1].  These  remarks  suggest  that  the  bi-directional  search  proposed  by  Tanabe  [25] 
and  Gonzaga  [13],  which  amounts  to  varying  in  the  definition  of  the  direction  d  of 
(5.13),  may  not  be  as  successful  as  hoped. 

Finally,  while  we  hope  that  our  framework  has  shown  the  advantages  oi 
primal-dual  setting  and  approximate  centering  in  projective  algorithms,  the  true 
worst-case  complexity  of  Karmarkar's  primal-only  projective  algorithm  is  still  unknown. 
Are  there  examples  requiring  O(nL)  iterations,  or  can  the  improved  behavior  observed  in 
practice  be  proved  to  hold  in  general?  An  interesting  discussion  relating  to  this  question 
can  be  found  in  McDiarmid  [17]. 
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